# load package(s)
library(tidyverse)
library(ggthemes)
library(ggplot2)
library(patchwork)
library(cowplot)
library(showtext)
library(ThemePark)
library(sysfonts)
library(dplyr)
# Enable showtext
showtext_auto()
font_add("ArialBold", "/System/Library/Fonts/Supplemental/Arial Bold.ttf")
#font_add("CamptonBlack", "~/font/CamptonBlack.otf")
# read in the cdc dataset
cdc <- read_delim(file = "data/cdc.txt", delim = "|") |>
  mutate(
    genhlth = factor(
      genhlth,
      levels = c("excellent", "very good", "good", "fair", "poor"),
      labels = c("Excellent", "Very Good", "Good", "Fair", "Poor")
    )
  )
# read in NU admission data
nu_admission_data <- read_csv("data/NU_admission_data.csv") |>
  janitor::clean_names()
# set seed
set.seed(2468)
# selecting a random subset of size 100
cdc_small <- cdc |>
  slice_sample(n = 100)
L09 Themes
Data Visualization (STAT 302)
Overview
The goal of this lab is to play around with the theme options in ggplot2.
Datasets
We’ll be using the cdc.txt and NU_admission_data.csv datasets.
Exercise 1
Use the cdc_small dataset to explore several pre-set themes. The code below constructs the familiar scatterplot of weight by height and stores it in plot_01. Display plot_01 to observe the default theme. Then apply and display at least 7 other pre-set themes from the ggplot2 or ggthemes packages. Don’t worry about making adjustments to the figures under the new themes. Just get a sense of what the themes are doing to the original figure plot_01.
There should be at least 8 plots for this task; plot_01 is pictured below. Use patchwork or cowplot in combination with the R YAML chunk options fig-height and fig-width (out-width and fig-align may be useful as well) to set up the 8 plots together in a user-friendly arrangement.
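As a rough sketch of the arrangement step, the code below applies seven built-in ggplot2 themes to a stand-in scatterplot and lays all eight panels out with patchwork. The mtcars data, the object names (base_plot, theme_list, all_plots, theme_grid), and the 2 x 4 layout are assumptions for illustration. In the lab, plot_01 would take the place of base_plot, and ggthemes themes such as theme_economist() or theme_tufte() could be added to the list.

```r
library(ggplot2)
library(patchwork)

# Stand-in for plot_01 (assumption: built-in mtcars data, not cdc_small)
base_plot <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  labs(title = "theme_gray (default)")

# Seven complete themes that ship with ggplot2
theme_list <- list(
  theme_bw       = theme_bw(),
  theme_minimal  = theme_minimal(),
  theme_classic  = theme_classic(),
  theme_light    = theme_light(),
  theme_dark     = theme_dark(),
  theme_linedraw = theme_linedraw(),
  theme_void     = theme_void()
)

# Apply each theme and use its name as the panel title
themed_plots <- lapply(names(theme_list), function(theme_name) {
  base_plot + theme_list[[theme_name]] + labs(title = theme_name)
})

# Default plot first, then the seven themed versions: 8 panels in total
all_plots <- c(list(base_plot), themed_plots)

# Arrange the panels in a grid with 4 columns (2 rows)
theme_grid <- wrap_plots(all_plots, ncol = 4)
theme_grid
```

In a Quarto document, chunk options such as `#| fig-width: 12` and `#| fig-height: 6` at the top of the chunk give the grid more room; those particular values are only a starting guess.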
# plot
plot_01 <- ggplot(
  data = cdc_small,
  aes(x = height, y = weight)
) +
  geom_point(size = 3, aes(shape = genhlth, color = genhlth)) +
  scale_y_continuous(
    name = "Weight in Pounds",
    limits = c(100, 275),
    breaks = seq(100, 275, 25),
    # `transform` replaces the deprecated `trans` argument (ggplot2 >= 3.5.0)
    transform = "log10",
    labels = scales::label_number(
      accuracy = 1,
      suffix = " lbs"
    )
  ) +
  scale_x_continuous(
    name = "Height in Inches",
    limits = c(60, 80),
    breaks = seq(60, 80, 5),
    labels = scales::label_number(accuracy = 1, suffix = " in")
  ) +
  scale_shape_manual(
    name = "Health?",
    labels = c(
      "Excellent", "Very Good",
      "Good", "Fair", "Poor"
    ),
    values = c(17, 19, 15, 9, 4)
  ) +
  scale_color_brewer(
    name = "Health?",
    labels = c(
      "Excellent", "Very Good",
      "Good", "Fair", "Poor"
    ),
    palette = "Set1"
  ) +
  # place the legend inside the panel, anchored at the bottom-right corner
  theme(
    legend.position = "inside",
    legend.position.inside = c(1, 0),
    legend.justification = c(1, 0)
  ) +
  labs(title = "CDC BRFSS: Weight by Height")
plot_01
Which theme or themes do you particularly like? Why?
Exercise 2
Using plot_01 from Exercise 1 and the theme() function, attempt to construct the ugliest plot possible (example pictured below). Be creative! It should NOT look exactly like the example. Since the goal is to understand a variety of adjustments, you should use a minimum of 10 different manual adjustments within theme().
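To illustrate what counts as a manual adjustment, here is a minimal sketch with ten separate theme() arguments. It uses mtcars as a stand-in for plot_01, and the colors, sizes, and the names base_plot and ugly_plot are arbitrary choices for illustration, not the pictured example.

```r
library(ggplot2)

# Stand-in for plot_01 (assumption: built-in mtcars data)
base_plot <- ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point(size = 3) +
  labs(title = "An intentionally ugly plot", color = "Cylinders")

# Ten separate manual adjustments inside a single theme() call
ugly_plot <- base_plot +
  theme(
    plot.background   = element_rect(fill = "yellow"),
    panel.background  = element_rect(fill = "hotpink"),
    panel.grid.major  = element_line(color = "green", linewidth = 2),
    panel.grid.minor  = element_line(color = "blue", linetype = "dotted"),
    axis.text.x       = element_text(angle = 45, hjust = 1, color = "purple", size = 14),
    axis.title.y      = element_text(face = "bold.italic", color = "red"),
    axis.ticks        = element_line(color = "brown", linewidth = 3),
    plot.title        = element_text(hjust = 1, size = 24, family = "serif"),
    legend.background = element_rect(fill = "orange", color = "black"),
    legend.key        = element_rect(fill = "cyan")
  )
ugly_plot
```

Each argument names a theme element and pairs it with the matching element function: element_rect() for backgrounds, element_line() for grid lines and ticks, and element_text() for text.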
Exercise 3
We will be making use of your code from Exercise 3 on L07 Layers. Using NU_admission_data.csv, you created two separate plots derived from the single plot depicted in undergraduate-admissions-statistics.pdf. Style these plots so they follow a “Northwestern” theme. You are welcome to display the plots separately OR design a layout that displays both together (likely one stacked above the other).
Check out the following webpages to help create your Northwestern theme:
- Visual Identity
- Color Palettes
- Fonts & Typography — Need to use substitute fonts
Additional requirement:
Use a free non-standard font from Google Fonts for the title. Pick one that looks similar to a Northwestern font.
I find this blog post to be extremely useful for adding fonts. Important packages for using non-standard fonts are showtext, extrafont, extrafontdb, and sysfonts. The last 3 generally just need to be installed (not loaded per session).
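A minimal sketch of one way to register a Google font with showtext and use it in a reusable theme follows. "Poppins" is an arbitrary illustrative choice, not a recommended Northwestern substitute. The names title_family, nu_purple, theme_nu_sketch(), and nu_demo are hypothetical, and mtcars stands in for the admissions data. The font download needs an internet connection, so the sketch falls back to the default sans family if it fails.

```r
library(ggplot2)
library(showtext)  # attaches sysfonts, which provides font_add_google()

# Try to register a Google font; fall back to "sans" if the download fails
# (for example, when offline). "Poppins" is only an illustrative choice.
title_family <- tryCatch(
  {
    font_add_google(name = "Poppins", family = "poppins")
    "poppins"
  },
  error = function(e) "sans"
)

# Let showtext handle text rendering so the registered font is used
showtext_auto()

# Northwestern purple
nu_purple <- "#4E2A84"

# Hypothetical theme function: a starting point, not an official style
theme_nu_sketch <- function() {
  theme_minimal() +
    theme(
      plot.title = element_text(
        family = title_family, face = "bold",
        color = nu_purple, size = 18
      ),
      axis.title = element_text(color = nu_purple),
      panel.grid.minor = element_blank()
    )
}

# Stand-in plot (assumption: built-in mtcars data)
nu_demo <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(color = nu_purple) +
  labs(title = "Title set in a Google font") +
  theme_nu_sketch()
nu_demo
```

Wrapping the adjustments in a function lets the same theme be added to both admissions plots with a single call.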
Challenge
Using the cdc_small dataset, create your own version of the plot below.
Must haves:
- Use two non-standard fonts (one for labeling the point and the other for the axes)
- Use at least two colors (one for the added point, another for the rest of the points)
- A curved arrow used to label the point
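The color and arrow requirements can be sketched with annotate(), as below. This assumes mtcars as a stand-in for cdc_small and labels its heaviest car; the offsets, colors, and the names highlight and arrow_plot are arbitrary choices for illustration.

```r
library(ggplot2)

# Assumption: mtcars stands in for cdc_small; label its heaviest car
highlight <- mtcars[which.max(mtcars$wt), ]

arrow_plot <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  # all points in one color
  geom_point(color = "grey40") +
  # the highlighted point drawn again in a second color
  geom_point(data = highlight, color = "firebrick", size = 3) +
  # curved arrow running from near the label to just beside the point
  annotate(
    "curve",
    x = highlight$wt - 1, y = highlight$mpg + 8,
    xend = highlight$wt - 0.05, yend = highlight$mpg + 0.5,
    curvature = -0.3,  # flipping the sign bends the arrow the other way
    arrow = arrow(length = unit(0.1, "inches")),
    color = "firebrick"
  ) +
  # text label placed just above the start of the arrow
  annotate(
    "text",
    x = highlight$wt - 1, y = highlight$mpg + 8.5,
    label = rownames(highlight),
    vjust = 0,
    color = "firebrick"
  )
arrow_plot
```

The text annotation also accepts a family argument, which is where a non-standard font for the point label could be supplied.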
Using Bilbo Baggins’ responses below to the CDC BRFSS questions, add Bilbo’s data point to a scatterplot of weight by height.
- genhlth - How would you rate your general health? fair
- exerany - Have you exercised in the past month? 1=yes
- hlthplan - Do you have some form of health coverage? 0=no
- smoke100 - Have you smoked at least 100 cigarettes in your lifetime? 1=yes
- height - height in inches: 46
- weight - weight in pounds: 120
- wtdesire - weight desired in pounds: 120
- age - in years: 45
- gender - m for males and f for females: m
Adding non-standard fonts can be an adventure. I find this blog post to be extremely useful for adding fonts. Important packages for using non-standard fonts are showtext, extrafont, extrafontdb, and sysfonts. The last 3 generally just need to be installed (not loaded per session).
Hint:
- Create a new dataset (maybe call it bilbo or bilbo_baggins) using either data.frame() (base R - example in book) or tibble() (tidyverse - see help documentation for the function). Make sure to use variable names that exactly match cdc’s variable names. We have provided the tidyverse approach.
- Search Google Fonts to find some free fonts to use (you can also get free fonts from other locations).
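A minimal sketch of the tidyverse approach from the first hint, with values copied from Bilbo's responses above. The column types are assumptions and may need adjusting to match cdc (for example, genhlth is a factor with capitalized labels after the setup code runs).

```r
library(tibble)

# One-row dataset whose column names match those of cdc
# (assumption: numeric columns stored as doubles, text columns as character)
bilbo <- tibble(
  genhlth  = "fair",
  exerany  = 1,
  hlthplan = 0,
  smoke100 = 1,
  height   = 46,
  weight   = 120,
  wtdesire = 120,
  age      = 45,
  gender   = "m"
)
bilbo
```

Because the column names match, a layer such as geom_point(data = bilbo, ...) can reuse the same height and weight aesthetics as the main scatterplot. Note that Bilbo's height of 46 inches falls outside the x-axis limits of c(60, 80) used in plot_01, so those limits would need widening for his point to appear.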